micro-clustering property
Reviews: Flexible Models for Microclustering with Application to Entity Resolution
The main strengths of the paper are as follows. It identifies and defines an important property of cluster sizes that existing infinitely exchangeable clustering models do not satisfy. Many applications, including but not limited to entity resolution, could require this property. It proposes a framework for defining infinitely exchangeable clustering models that satisfy this micro-clustering property, and it analyzes why the DP mixture model is an unsatisfactory instance of this class. It then proposes two specific and interesting instances of the class, obtained by choosing particular distributions for the number of clusters and for the cluster sizes, and derives reseating algorithms for both.
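To make the property concrete: as I understand it, micro-clustering asks that the size of the largest cluster, M_N, grow sublinearly in the number of data points N, so that M_N / N tends to 0 as N grows. The sketch below is my own illustration, not code from the paper. It simulates a Chinese restaurant process (the partition underlying a DP mixture) and reports the largest-cluster fraction. The function names and the choice alpha = 1.0 are assumptions made for the example.

```python
import random


def crp_cluster_sizes(n, alpha, rng):
    """Seat n customers with a Chinese restaurant process; return table sizes."""
    sizes = []
    for i in range(n):
        # With i customers already seated, open a new table
        # with probability alpha / (i + alpha).
        if rng.random() < alpha / (i + alpha):
            sizes.append(1)
            continue
        # Otherwise join an existing table with probability
        # proportional to its current size.
        r = rng.randrange(i)
        acc = 0
        for k, s in enumerate(sizes):
            acc += s
            if r < acc:
                sizes[k] += 1
                break
    return sizes


def largest_cluster_fraction(n, alpha=1.0, seed=0):
    """Return M_N / N for one simulated CRP partition of n points."""
    rng = random.Random(seed)
    sizes = crp_cluster_sizes(n, alpha, rng)
    return max(sizes) / n


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, largest_cluster_fraction(n))
```

Under the CRP, one would expect the printed fraction to stay bounded away from zero as N increases instead of shrinking toward it, which is the behavior the micro-clustering property rules out.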